Pseudo-Labeling for Massively Multilingual Speech Recognition

Lugosch, Loren; Likhomanenko, Tatiana; Synnaeve, Gabriel; Collobert, Ronan

arXiv:2111.00161 (cs)

[Submitted on 30 Oct 2021 (v1), last revised 8 Mar 2022 (this version, v3)]

Abstract: Semi-supervised learning through pseudo-labeling has become a staple of state-of-the-art monolingual speech recognition systems. In this work, we extend pseudo-labeling to massively multilingual speech recognition with 60 languages. We propose a simple pseudo-labeling recipe that works well even with low-resource languages: train a supervised multilingual model, fine-tune it with semi-supervised learning on a target language, generate pseudo-labels for that language, and train a final model using pseudo-labels for all languages, either from scratch or by fine-tuning. Experiments on the labeled Common Voice and unlabeled VoxPopuli datasets show that our recipe can yield a model with better performance for many languages that also transfers well to LibriSpeech.
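
The four steps of the recipe described in the abstract can be sketched as a small orchestration function. This is only an illustrative outline, not the authors' implementation: the function `pseudo_label_recipe` and the callables `train`, `adapt`, and `transcribe` are hypothetical placeholders supplied by the caller, and combining the labeled data with the pseudo-labels in the final step is an assumption rather than something the abstract states.

```python
from typing import Callable, Dict, List, Tuple

Audio = str                       # placeholder for an utterance, e.g. a file path
Labeled = List[Tuple[Audio, str]]  # (utterance, transcript) pairs


def pseudo_label_recipe(
    labeled: Dict[str, Labeled],
    unlabeled: Dict[str, List[Audio]],
    train: Callable[[Dict[str, Labeled]], object],
    adapt: Callable[[object, str, Labeled, List[Audio]], object],
    transcribe: Callable[[object, Audio], str],
) -> Tuple[object, Dict[str, Labeled]]:
    """Hypothetical outline of the recipe; all callables are user-supplied."""
    # Step 1: train a supervised multilingual model on all labeled data.
    base = train(labeled)

    pseudo: Dict[str, Labeled] = {}
    for lang, audio in unlabeled.items():
        # Step 2: fine-tune the multilingual model on one target language
        # with semi-supervised learning (details left to `adapt`).
        lang_model = adapt(base, lang, labeled.get(lang, []), audio)
        # Step 3: generate pseudo-labels for that language.
        pseudo[lang] = [(a, transcribe(lang_model, a)) for a in audio]

    # Step 4: train a final model using pseudo-labels for all languages.
    # Assumption: labeled data is kept alongside the pseudo-labels, and the
    # final model is trained from scratch (fine-tuning `base` is the other
    # option the abstract mentions).
    combined = {
        lang: labeled.get(lang, []) + pseudo.get(lang, [])
        for lang in sorted(set(labeled) | set(pseudo))
    }
    final = train(combined)
    return final, pseudo
```

Passing the training and decoding steps in as callables keeps the sketch independent of any particular speech toolkit; a real system would substitute its own acoustic model training and decoding routines.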

Comments:	Accepted to ICASSP 2022. New version has links to code/models + more training curves for larger model. (Fixed code link.)
Subjects:	Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2111.00161 [cs.CL]
	(or arXiv:2111.00161v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2111.00161

Submission history

From: Loren Lugosch
[v1] Sat, 30 Oct 2021 03:30:17 UTC (1,475 KB)
[v2] Fri, 4 Mar 2022 21:20:10 UTC (1,475 KB)
[v3] Tue, 8 Mar 2022 14:48:41 UTC (1,475 KB)
